Abstract. A pedagogical account of some aspects of Extreme Value Statistics (EVS) is presented 
from the somewhat non-standard viewpoint of Large Deviation Theory. We address the following 
problem: given a set of $N$ i.i.d. random variables $\{X_1,\ldots,X_N\}$ drawn from a parent probability 
density function (pdf) $p(x)$, what is the probability that the maximum value of the set $X_{\max} = \max_i X_i$ 
is “atypically larger” than expected? The cases of exponential and Gaussian distributed 
variables are worked out in detail, and the right rate function for a general pdf in the Gumbel 
basin of attraction is derived. The Gaussian case convincingly demonstrates that the full rate 
function cannot be determined from the knowledge of the limiting distribution (Gumbel) alone, 
thus implying that it indeed carries additional information. Given the simplicity and richness of 
the result and its derivation, its absence from textbooks, tutorials and lecture notes on EVS for 
physicists appears inexplicable. 



Large deviations of the maximum of i.i.d. random variables 

1. Introduction 




Extreme Value Statistics (EVS) and Large Deviations Theory (LDT) are undoubtedly among 
the most solid and fertile theoretical masterpieces of modern probability theory. Developed 
independently over the course of several decades by top-class mathematicians, they have both 
gradually percolated into the domain of Statistical Physics (SP), to the extent that LDT is 
now recognized as the proper language in which SP formalism should be expressed, and cutting- 
edge research on the EVS of correlated variables is nowadays the bread and butter of dozens of 
colleagues. 

At odds with the widespread impact LDT and EVS have produced outside the realm of 
rigorous mathematics, physicists have been somehow reluctant to put together truly accessible 
and pedagogical accounts of their fundamentals, with the exception of highly commendable but 
isolated enterprises (see e.g. [1-3] for LDT - [4-8] for EVS - and references therein). One of 
the unfortunate consequences is that neither theory is typically taught or integrated in standard 
physics curricula around the globe. 

In their “classical” (textbook) descriptions, EVS primarily deals with (among other 
observables) the statistics of the maximum $X_{\max}$ (or minimum) of a set of random variables 
$\{X_1,\ldots,X_N\}$, while LDT is concerned with atypical fluctuations of a random variable $S_N$ 
(depending on a parameter $N$) away from its expected value $\langle S_N\rangle$, which decay exponentially 
fast as the parameter $N$ increases. LDT estimates are typically written in the form 

$$
\mathrm{Prob}[S_N < s] \sim
\begin{cases}
\exp\left(-\omega_N^{\ell}\,\psi_\ell(s)\right), & s < \langle S_N\rangle \\
1 - \exp\left(-\omega_N^{r}\,\psi_r(s)\right), & s > \langle S_N\rangle ,
\end{cases}
\tag{1}
$$

where the nonzero left and right rate functions $\psi_{\ell,r}(s)$ control the (exponentially small) probability 
that $S_N$ takes values anomalously smaller or larger than $\langle S_N\rangle$, respectively. The symbol $\sim$ in (1) 
stands for $\lim_{N\to\infty} -\ln\mathrm{Prob}[S_N < s]/\omega_N^{\ell} = \psi_\ell(s)$ and similarly on the right. Note that nontrivial 
limits $\psi_{\ell,r}(s)$ can only be obtained by tuning the speeds $\omega_N^{\ell}$ and $\omega_N^{r}$ of the large deviation estimate 
to precise functions of $N$. As an example of this formalism, $S_N$ may be taken to be the sample 
mean $S_N = (1/N)\sum_{i=1}^{N} X_i$ of independent and identically distributed (i.i.d.) random variables 
$\{X_1,\ldots,X_N\}$, drawn from a common parent probability density function (pdf), see [2] for a set of 
instructive examples worked out in detail. 

From the exceedingly concise summary in the last paragraph, it is hard to speculate whether 
a connection between EVS and LDT should exist at all. They simply seem to target different 
attributes: “big” vs. “anomalously rare”. However, a moment of reflection should induce a quite 
natural question: what if the random variable $S_N$ (subject to atypical fluctuations) is taken to be 
$X_{\max}$ itself†, instead of the sample mean of the $X_i$'s? In other words, what is the probability that 
the maximum of a set of random variables is “atypically larger” (or smaller) than its expected 
value? 

† Obviously, $X_{\max}$ depends on the sample size $N$. 

Problems of this ilk have been addressed at length in the context of a certain type of strongly 
correlated random variables, namely the eigenvalues of random matrices (see [9] and references 
therein). It felt just natural to assume that the problem for i.i.d. random variables (a priori 
simpler) must have been settled long before. 

Much to my surprise, I was able to retrieve only a single paper [10] where the EVS of i.i.d. 
random variables was looked at through the prism of LDT. The authors of [10] must have felt the 
same bewilderment as they wrote “We are not aware of any other work on extreme value theory 
with results formulated in this way.”. However, the formal style and the intended audience of [10] 
make it tough reading for the uninitiated. 

I will argue here that this problem is at the same time sufficiently rich, instructive and 
simple (yet nontrivial) to deserve to be analyzed in detail and presented in a form accessible to 
an audience of trained physicists. This will be done by first introducing some preliminary notions 
(often not easy to find elsewhere) on “classical” EVS, keeping the style as informal as possible. 


2. Preliminaries on “classical” EVS for i.i.d. random variables 


Consider a collection of $N$ i.i.d. random variables $\{X_1,\ldots,X_N\}$, all drawn from the same 
continuous pdf $p(x)$. We denote by $P(x)$ the cumulative distribution function (cdf) of each of 
the $X_i$'s, $P(x) = \int_{-\infty}^{x} dy\, p(y)$. Also, we denote the maximum of the set $\{X_i\}$ by $X_{\max} = \max_i\{X_i\}$. 
The cdf of $X_{\max}$ (denoted in the following by $Q_N(x)$) can be easily written as 

$$
Q_N(x) = \mathrm{Prob}[X_{\max} < x]
= \int_{-\infty}^{x}\!\cdots\!\int_{-\infty}^{x} dx_1\cdots dx_N\, p(x_1)\cdots p(x_N)
= \Big[\underbrace{\int_{-\infty}^{x} dy\, p(y)}_{P(x)}\Big]^{N} ,
\tag{2}
$$


where one uses the fact that the maximum is smaller than x only if each of the variables is, and 
the independence of the variables. 
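
As a quick numerical sanity check of (2), here is a minimal Python sketch (not part of the original derivation). It assumes a uniform parent on $[0,1]$, for which $P(x) = x$, and compares $P(x)^N$ with a Monte Carlo estimate. The function names, the seed and the sample sizes are illustrative choices.

```python
import random

def max_cdf_exact(parent_cdf, x, n):
    """Cdf of the maximum of n i.i.d. variables, Eq. (2): Q_N(x) = P(x)**N."""
    return parent_cdf(x) ** n

def max_cdf_empirical(sampler, x, n, trials, rng):
    """Fraction of trials in which the largest of n draws stays below x."""
    hits = 0
    for _ in range(trials):
        if max(sampler(rng) for _ in range(n)) < x:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed, arbitrary choice

def uniform_cdf(x):
    # cdf of the uniform density on [0, 1] (assumed parent for this example)
    return min(max(x, 0.0), 1.0)

def uniform_sampler(r):
    return r.random()

exact = max_cdf_exact(uniform_cdf, 0.9, 5)                      # 0.9**5
estimate = max_cdf_empirical(uniform_sampler, 0.9, 5, 20000, rng)
print(exact, estimate)
```

The two printed numbers should agree up to the statistical error of the simulation, which shrinks as the number of trials grows.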

What happens now for $N \to \infty$? It is clear that $\lim_{N\to\infty} Q_N(x)$ for $x$ fixed is disappointingly 
trivial: since $0 \leq P(x) \leq 1$, the limit of $P(x)^N$ can only take two possible values: 0 or 1. In 
order to obtain a nontrivial limiting distribution, one has to send both $N, x \to \infty$, in such a way 
that the combination $z = (x - a_N)/b_N$ is kept constant for suitably chosen centering and scaling 
constants $a_N \in \mathbb{R}$ and $b_N > 0$, respectively. 

The standard goal of classical EVS can be summarized as follows: find $a_N$, $b_N$ and $F(z)$ (the 
latter independent of $N$) such that 

$$
\lim_{N\to\infty} Q_N(a_N + b_N z) = F(z) . \tag{3}
$$

The celebrated Fisher-Tippett-Gnedenko theorem [11-13] states that $F(z)$ can only be of 
three different types (Gumbel, Fréchet and Weibull), depending on the right tail of the parent pdf 
$p(x)$. Informally, if we denote by $x^\star = \sup\{x : P(x) < 1\}$ the upper endpoint of the support of 
$p(x)$: 

• If $x^\star$ is finite or infinite, and $p(x)$ falls off faster than any power for $x \to x^\star$ (for instance 
in the exponential and Gaussian cases), then the limiting distribution $F(z)$ is Gumbel, 
$F_{\mathrm{I}}(z) = \exp(-\exp(-z))$. 

• If $x^\star$ is infinite and $p(x)$ falls off as a power law, $p(x) \sim x^{-(\gamma+1)}$, then the limiting distribution 
$F(z)$ is Fréchet, $F_{\mathrm{II}}(z) = \exp(-z^{-\gamma})$ if $z > 0$ and 0 otherwise. 

• If $x^\star$ is finite, for instance $p(x) = 0$ for $x > 1$ and $p(x) \sim (1-x)^{\gamma-1}$ when $x \to 1^-$ with $\gamma > 0$, 
then the limiting distribution $F(z)$ is Weibull, $F_{\mathrm{III}}(z) = \exp(-|z|^{\gamma})$ for $z < 0$ and 1 otherwise. 

A more formal classification of basins of attraction can be found in [14], Theorem 1.2.1. In Fig. 1 
I plot the pdfs corresponding to the three classes above. 

Figure 1: Left to right: the pdfs $f_{\mathrm{I}}(z)$ (Gumbel), $f_{\mathrm{II}}(z)$ (Fréchet) and $f_{\mathrm{III}}(z)$ (Weibull). 


The tail cumulative distribution function $\bar{Q}_N(x) = \mathrm{Prob}[X_{\max} > x] = 1 - Q_N(x)$ obviously 
satisfies 

$$
\lim_{N\to\infty} \bar{Q}_N(a_N + b_N z) = 1 - F(z) , \tag{4}
$$

with the same constants $a_N$ and $b_N$. 

I summarize here three results [15] that are all of practical importance, but hard to find 
simultaneously stated on the same page. 

(i) The constants $a_N$ and $b_N$ can be found as follows ($P^{-1}(x)$ denotes the functional inverse of 
the cdf $P(x)$, if expressible in a closed form) 

(a) Gumbel 

$$
a_N = P^{-1}\left(1 - \frac{1}{N}\right) \quad\text{and}\quad b_N = P^{-1}\left(1 - \frac{1}{Ne}\right) - a_N , \tag{5}
$$

(b) Fréchet 

$$
a_N = 0 \quad\text{and}\quad b_N = P^{-1}\left(1 - \frac{1}{N}\right) , \tag{6}
$$

(c) Weibull 

$$
a_N = x^\star \quad\text{and}\quad b_N = x^\star - P^{-1}\left(1 - \frac{1}{N}\right) , \tag{7}
$$

where $x^\star$ as before is the upper endpoint of the support of $p(x)$. 

(ii) Given a certain $p(x)$, it is possible to predict to which “domain of attraction” (Gumbel, 
Fréchet or Weibull) its maximum belongs. Compute the following limit 

$$
\lim_{\epsilon\to 0} \frac{P^{-1}(1-\epsilon) - P^{-1}(1-2\epsilon)}{P^{-1}(1-2\epsilon) - P^{-1}(1-4\epsilon)} = 2^{c} . \tag{8}
$$

If $c = 0$, $c > 0$ or $c < 0$, the domain of attraction is Gumbel, Fréchet or Weibull respectively. 

(iii) The constants $\{a_N, b_N\}$ are not unique. If $\{a_N, b_N\}$ are suitable centering and scaling 
constants for a given $p(x)$, so are the constants $\{a'_N, b'_N\}$ provided the following limits hold 

$$
\lim_{N\to\infty} \frac{b'_N}{b_N} = 1 \tag{9}
$$

$$
\lim_{N\to\infty} \frac{a_N - a'_N}{b_N} = 0 . \tag{10}
$$


I also recommend the following references [16,17] for an approach to EVS based on renormalization 
ideas and PDEs. 
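
The recipes (5) and (8) are easy to try out numerically. The following Python sketch is only an illustration under stated assumptions: the three parent distributions (exponential, a pure power-law tail, and a bounded density on $[0,1]$) are my own choices, and each is specified through a hypothetical helper returning $P^{-1}(1-q)$ as a function of the tail probability $q$, which avoids subtracting nearly equal numbers.

```python
import math

def gumbel_constants(tail_quantile, n):
    """Eq. (5): a_N = P^{-1}(1 - 1/N), b_N = P^{-1}(1 - 1/(N e)) - a_N.
    tail_quantile(q) must return P^{-1}(1 - q)."""
    a_n = tail_quantile(1.0 / n)
    b_n = tail_quantile(1.0 / (n * math.e)) - a_n
    return a_n, b_n

def attraction_exponent(tail_quantile, eps):
    """Finite-eps version of the ratio in Eq. (8); returns c = log2(ratio)."""
    num = tail_quantile(eps) - tail_quantile(2 * eps)
    den = tail_quantile(2 * eps) - tail_quantile(4 * eps)
    return math.log2(num / den)

mu, gam = 2.0, 2.0  # illustrative parameter values

def exponential(q):   # P(x) = 1 - exp(-mu x)
    return -math.log(q) / mu

def power_law(q):     # P(x) = 1 - x**(-gam) for x > 1
    return q ** (-1.0 / gam)

def bounded(q):       # P(x) = 1 - (1 - x)**gam on [0, 1]
    return 1.0 - q ** (1.0 / gam)

a_n, b_n = gumbel_constants(exponential, 1000)
c_exp = attraction_exponent(exponential, 1e-6)   # expected sign: zero
c_pow = attraction_exponent(power_law, 1e-6)     # expected sign: positive
c_bnd = attraction_exponent(bounded, 1e-6)       # expected sign: negative
print(a_n, b_n, c_exp, c_pow, c_bnd)
```

For these three idealized tails the ratio in (8) does not depend on $\epsilon$ at all, so the sign of $c$ is already visible at finite $\epsilon$; for a generic parent one would have to watch the trend as $\epsilon$ decreases.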

I will mainly focus on the Gumbel universality class in the following. In the next section, the 
exponential pdf will be used as a warm-up exercise to illustrate these basic notions, as well as the 
LDT treatment that was promised in the introduction. 

3. Warm-up: exponential pdf 

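Consider for simplicity the case $p(x) = \mu\exp(-\mu x)$ for $x > 0$. 


3.1. Limiting distribution 
From the general formalism 

$$
Q_N(x) = \left[\mu\int_0^{x} dy\, e^{-\mu y}\right]^{N} = \left[1 - e^{-\mu x}\right]^{N} . \tag{11}
$$

As $x \to \infty$, expanding the logarithm one gets 

$$
Q_N(x) \approx e^{-N e^{-\mu x}} = e^{-e^{-(\mu x - \ln N)}} = F_{\mathrm{I}}(z) , \tag{12}
$$

if $z = \mu x - \ln N$. Here, $F_{\mathrm{I}}(z)$ is the Gumbel cdf. This implies that $a_N = \ln N/\mu$ and $b_N = 1/\mu$. 
Of course, one could have derived them recalling (5). The cdf for the exponential pdf is 

$$
P(x) = \mu\int_0^{x} \exp(-\mu y)\, dy = 1 - e^{-\mu x} , \tag{13}
$$

hence $P^{-1}(x) = -(1/\mu)\ln(1 - x)$ for $0 < x < 1$. Therefore 

$$
a_N = P^{-1}\left(1 - \frac{1}{N}\right) = \frac{\ln N}{\mu} , \qquad
b_N = P^{-1}\left(1 - \frac{1}{Ne}\right) - a_N = \frac{1}{\mu} , \tag{14}
$$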
as expected. In the next section, I exploit a rare luxury offered by the exponential pdf: the 
distribution of the maximum can be computed also at finite $N$. This offers the opportunity to 
understand at a somewhat deeper level the meaning of the centering and scaling constants $a_N$ and 
$b_N$. 
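
The convergence of the rescaled cdf to the Gumbel form can be watched directly. The Python sketch below is an illustration only (the helper names and the test point $z = 0.5$ are my choices): it evaluates the exact $Q_N(x)$ of (11) at $x = a_N + b_N z$ with $a_N = \ln N/\mu$, $b_N = 1/\mu$, and prints its distance from $\exp(-\exp(-z))$ for growing $N$.

```python
import math

def q_exponential(x, n, mu=1.0):
    """Exact cdf of the maximum, Eq. (11): (1 - exp(-mu x))**N for x > 0."""
    if x <= 0:
        return 0.0
    # log1p keeps accuracy when exp(-mu x) is tiny
    return math.exp(n * math.log1p(-math.exp(-mu * x)))

def gumbel_cdf(z):
    return math.exp(-math.exp(-z))

def scaled_gap(n, z, mu=1.0):
    """|Q_N(a_N + b_N z) - F_I(z)| with a_N = ln N / mu and b_N = 1 / mu."""
    a_n, b_n = math.log(n) / mu, 1.0 / mu
    return abs(q_exponential(a_n + b_n * z, n, mu) - gumbel_cdf(z))

for n in (10, 1000, 100000):
    print(n, scaled_gap(n, 0.5))

# At fixed x, by contrast, Q_N(x) collapses to 0 for large N
print(q_exponential(3.0, 10**6))
```

The last line illustrates the trivial limit at fixed $x$ discussed in Section 2, as opposed to the nontrivial limit obtained when $x$ grows with $N$.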


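3.2. Finite $N$ and meaning of $a_N$ and $b_N$ 

Let us compute the average and variance of $X_{\max}$ for finite $N$. By definition 

$$
\mu_1(N) = \langle X_{\max}\rangle = \int_0^{\infty} dx\, x\, \frac{\partial}{\partial x} Q_N(x) = \frac{\gamma + \psi^{(0)}(N+1)}{\mu} , \tag{15}
$$

where $\gamma = 0.577216\ldots$ is the Euler-Mascheroni constant, and $\psi^{(m)}(x)$ is the Polygamma function 
($\psi^{(m)}(x) = d^{m}\psi(x)/dx^{m}$, where $\psi(x) = \Gamma'(x)/\Gamma(x)$ is the logarithmic derivative of the Gamma 
function). 

Taking the limit $N \to \infty$, we find 

$$
\mu_1(N) \sim \frac{\ln N}{\mu} + \frac{\gamma}{\mu} + \frac{1}{2\mu N} - \frac{1}{12\mu N^2} + \ldots , \qquad N \to \infty . \tag{16}
$$

Now, one has $Q_N(a_N + b_N z) \to F_{\mathrm{I}}(z)$ for $N \to \infty$, where $F_{\mathrm{I}}(z)$ is the Gumbel cdf. The 
Gumbel pdf is 

$$
f_{\mathrm{I}}(z) = \frac{d}{dz} F_{\mathrm{I}}(z) = e^{-z - e^{-z}} , \tag{17}
$$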

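whose average and variance can be computed as follows 

$$
\langle z\rangle_G = \int_{-\infty}^{\infty} dz\, z\, e^{-z - e^{-z}} = \gamma \tag{18}
$$

$$
\sigma_G^2 = \int_{-\infty}^{\infty} dz\, z^2\, e^{-z - e^{-z}} - \gamma^2 = \frac{\pi^2}{6} . \tag{19}
$$

The centering and scaling parameters $a_N$ and $b_N$ were computed in the last section as $a_N = \ln N/\mu$ 
and $b_N = 1/\mu$. 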

Comparing with (16), the leading term of the $N \to \infty$ expansion of the first moment turns 
out to be precisely equal to $a_N$, the centering parameter! This means that $a_N$ governs the average 
location of the maximum for large $N$. Moreover, the following holds 

$$
\lim_{N\to\infty} \frac{\mu_1(N) - a_N}{b_N} = \lim_{N\to\infty} \frac{\mu_1(N) - \ln N/\mu}{1/\mu} = \gamma , \tag{20}
$$

i.e. the average of the Gumbel pdf! Therefore the parameter $b_N$ ensures that the average location 
of the maximum in the large $N$ limit is adjusted to the (nonzero!) average of the limiting pdf 
(Gumbel). 
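
Equations (15), (16) and (20) can be checked with a few lines of Python. The sketch below is illustrative and relies on the standard identity $\gamma + \psi^{(0)}(N+1) = H_N$ (the $N$th harmonic number), so that no special-function library is needed; the function names and the value of $N$ are my choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def mean_max_exponential(n, mu=1.0):
    """Exact mean of the maximum, Eq. (15), written as H_N / mu."""
    return sum(1.0 / k for k in range(1, n + 1)) / mu

def rescaled_mean(n, mu=1.0):
    """(mu_1(N) - a_N) / b_N with a_N = ln N / mu and b_N = 1 / mu, as in Eq. (20)."""
    a_n, b_n = math.log(n) / mu, 1.0 / mu
    return (mean_max_exponential(n, mu) - a_n) / b_n

n = 10**5
print(rescaled_mean(n), EULER_GAMMA)               # close to gamma
print(rescaled_mean(n) - EULER_GAMMA, 1 / (2 * n))  # leading correction, cf. Eq. (16)
```

The second printed line compares the residual with the $1/(2N)$ term of the expansion (16), which is independent of $\mu$ after rescaling by $b_N$.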
Let us now compute the second moment and the variance of $X_{\max}$. One gets analogously: 

$$
\mu_2(N) = \langle X_{\max}^2\rangle = \int_0^{\infty} dx\, x^2\, \frac{\partial}{\partial x} Q_N(x) = \frac{6H_N^2 - 6\psi^{(1)}(N+1) + \pi^2}{6\mu^2} , \tag{21}
$$

where $H_N = \sum_{k=1}^{N} 1/k$ is the $N$th harmonic number. Computing now the variance 

$$
\mathrm{Var}_{X_{\max}}(N) = \mu_2(N) - (\mu_1(N))^2 = \frac{\pi^2 - 6\psi^{(1)}(N+1)}{6\mu^2} , \tag{22}
$$

which is exact for all $N$. Expanding for $N \to \infty$, we see that the variance saturates at a finite 
value, namely 

$$
\mathrm{Var}_{X_{\max}}(N) \sim \frac{\pi^2}{6\mu^2} - \frac{1}{\mu^2 N} + \ldots , \qquad N \to \infty . \tag{23}
$$

Interestingly, the saturating value $\pi^2/(6\mu^2)$ has a quite natural interpretation as the product of i) the 
variance of the Gumbel pdf $\sigma_G^2 = \pi^2/6$ and ii) the square of $b_N = 1/\mu$ (the scaling parameter of 
the Extreme Value distribution). In summary 

$$
\lim_{N\to\infty} \frac{\mathrm{Var}_{X_{\max}}(N)}{b_N^2} = \sigma_G^2 , \tag{24}
$$


implying that $b_N$ serves also the purpose of “shrinking” the width of the pdf of $X_{\max}$ as much as 
needed to squeeze it under the envelope of the limiting (Gumbel) pdf for $N \to \infty$. 

The properties (20) and (24) can be more compactly expressed using the notation $X_{\max} \sim 
a_N + b_N\chi$, with $\chi$ a Gumbel-distributed random variable. Standard properties (linearity and 
homogeneity) of cumulants then imply $\langle X_{\max}\rangle \sim a_N + b_N\langle\chi\rangle$ (namely Eq. (20)) and $\mathrm{Var}_{X_{\max}}(N) = 
b_N^2\,\mathrm{Var}(\chi)$ (namely Eq. (24)). 
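
The saturation of the variance in (22)-(24) can be verified in the same spirit. The Python sketch below is an illustration, using the standard identity $\psi^{(1)}(N+1) = \pi^2/6 - \sum_{k=1}^{N} 1/k^2$ to rewrite (22) as a finite sum; the function name and the parameter values are my choices.

```python
import math

def var_max_exponential(n, mu=1.0):
    """Exact variance of the maximum, Eq. (22), rewritten as sum_{k<=N} 1/(k mu)^2."""
    return sum(1.0 / (k * k) for k in range(1, n + 1)) / mu**2

n, mu = 10**4, 2.0
b_n = 1.0 / mu
ratio = var_max_exponential(n, mu) / b_n**2

print(ratio, math.pi**2 / 6)              # Eq. (24): ratio approaches the Gumbel variance
print(ratio - math.pi**2 / 6, -1.0 / n)   # leading correction, cf. Eq. (23)
```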


3.3. Large deviations 

I set $\mu = 1$ for simplicity in the following. So far I have considered the standard textbook treatment 
of EVS for exponential variates, which can be summarized in the statement 

$$
\mathrm{Prob}[X_{\max} > \ln N + z] \sim 1 - e^{-e^{-z}} , \qquad N \to \infty , \tag{25}
$$

with $z \sim O(1)$ for large $N$. In the last subsection, $a_N = \ln N$ was shown to be the average location 
of the maximum in the large $N$ limit. Therefore, the “classical” statement (25) concerns typical 
$O(1)$ fluctuations around the average value in the large $N$ limit. 

It is then natural to ask instead the following different question: what is the probability that 
the maximum is “much larger” than expected, meaning that it deviates from $\ln N$ (to the right) 
by an amount proportional to $\ln N$? 

In formulae, 

$$
\mathrm{Prob}[X_{\max} > (\ln N)\xi] = \,? \qquad \xi \sim O(1) . \tag{26}
$$
Note that this is evidently a rare event! Its probability must decay quite fast as N increases. 
Still, it may not be completely clear at this stage whether the answer to the question in (26) is 
somehow implicitly contained already in the limiting statement (25). I will show later that this 
is not the case: the large deviation results cannot be in general deduced as a corollary of the 
limiting distribution alone, which holds on a much narrower scale (for small fluctuations around 
the average). 

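Computing (26) is rather straightforward 

$$
\mathrm{Prob}[X_{\max} > (\ln N)\xi] = 1 - \mathrm{Prob}[X_{\max} < (\ln N)\xi] = 1 - Q_N((\ln N)\xi)
= 1 - \left(1 - N^{-\xi}\right)^{N} . \tag{27}
$$

The claim (immediate to verify using $(1 - N^{-\xi})^{N} \sim 1 - N^{1-\xi}$ for large $N$) is therefore 

$$
\lim_{N\to\infty} \frac{-\ln\mathrm{Prob}[X_{\max} > (\ln N)\xi]}{\ln N} = \psi_r(\xi) =
\begin{cases}
\xi - 1 & \xi > 1 \\
0 & \text{otherwise}
\end{cases} \tag{28}
$$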

This simple result is expressed in a “standard” large deviation form, albeit with a quite unusual 
speed $\ln N$, in contrast with the speeds $N$ and $N^2$ that are customary for i.i.d. sample means 
and random matrix observables [9], respectively. The probability of a large fluctuation to the 
right of the expected value for the maximum, therefore, decays effectively as a power-law in $N$, 
$\mathrm{Prob}[X_{\max} > (\ln N)\xi] \sim 1/N^{\psi_r(\xi)}$, with exponent given by the right rate function $\psi_r(\xi) = \xi - 1$. 
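
The limit (28) can be probed numerically starting from the exact expression (27). The Python sketch below is illustrative (names and parameter values are my choices); it is parametrised by $\ln N$ rather than $N$, and uses `log1p`/`expm1` so that the difference $1 - (1 - N^{-\xi})^N$ is not lost to rounding.

```python
import math

def tail_prob_log(log_n, xi):
    """ln Prob[X_max > (ln N) xi] from Eq. (27) with mu = 1 and N = exp(log_n)."""
    n = math.exp(log_n)
    eps = math.exp(-xi * log_n)  # N**(-xi)
    # 1 - (1 - eps)**N, written so that tiny probabilities keep full precision
    return math.log(-math.expm1(n * math.log1p(-eps)))

def empirical_rate(log_n, xi):
    """Finite-N estimate of the rate function: -ln Prob / ln N."""
    return -tail_prob_log(log_n, xi) / log_n

for log_n in (5.0, 20.0, 200.0):
    print(log_n, empirical_rate(log_n, 2.0), empirical_rate(log_n, 0.5))
```

For $\xi = 2$ the printed values should approach $\xi - 1 = 1$, while for $\xi = 0.5$ (to the left of the typical value) they should approach 0, in line with the two branches of (28).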

It is useful to summarize the two (small and large) deviation results presented so far 

$$
\mathrm{Prob}[X_{\max} > \ln N + z] \sim 1 - e^{-e^{-z}} , \tag{29}
$$

$$
\mathrm{Prob}[X_{\max} > (\ln N)\xi] \approx e^{-(\ln N)(\xi - 1)} , \tag{30}
$$

with $z$ and $\xi$ of $O(1)$ for large $N$. 

I will show now that an interesting “matching” occurs between the “most unlikely” among 
typical fluctuations (probed by the limit $z \gg 1$ in (29)) and the “most likely” among atypical 
fluctuations (probed by the limit $\xi \to 1$ in (30)). 

For $z \gg 1$ one has 

$$
\mathrm{Prob}[X_{\max} > \ln N + z] \approx e^{-z} . \tag{31}
$$

Setting now $\ln N + z \sim (\ln N)\xi$, one obtains that in the matching regime $\xi \sim 1 + z/\ln N$, and 
substituting in (30) 

$$
e^{-(\ln N)(\xi - 1)}\Big|_{\xi \sim 1 + z/\ln N} \sim e^{-z} \tag{32}
$$

as in (31). Hence, the large deviation (30) when approaching $\ln N$ ($\xi \to 1$) from the right on a scale 
of $O(1/\ln N)$ smoothly matches the far-right tail of the typical (limiting) distribution. I offer here 
two remarks, though: 
(i) While this matching property is naturally expected to hold, and it does in a few other examples 
I know [9], it does not seem to have the status of a necessary/sufficient condition, encoded 
in a theorem (at least, not that I am aware of). This would be a very interesting research 
direction to pursue, though, much in the spirit of Bryc’s regularity condition for retrieving 
the Central Limit Theorem from the rate function [18]. 

(ii) Assuming that this matching must hold necessarily, it would have occurred for any rate 
function $\psi_r(\xi)$ behaving as $\xi - 1$ for $\xi \to 1$: therefore the true rate function $\psi_r(\xi)$ (among 
all the possibilities) cannot be deduced by this matching (i.e. by the behavior of the limiting 
distribution for $z \gg 1$) alone: one really has to compute the limit (28) “from scratch”! This 
will be all the more evident in the next case. 
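
The matching between (29) and (30) can also be seen numerically. The Python sketch below is an illustration under the same conventions as the text ($\mu = 1$); the function names and the values $\ln N = 50$, $z = 6$ are my choices. It compares the exact tail probability, the Gumbel tail (29), and the large deviation form (30) evaluated at $\xi = 1 + z/\ln N$.

```python
import math

def exact_tail(log_n, z):
    """Prob[X_max > ln N + z] for exponential variables with mu = 1, from Eq. (11)."""
    n = math.exp(log_n)
    return -math.expm1(n * math.log1p(-math.exp(-(log_n + z))))

def typical_tail(z):
    """Right-hand side of Eq. (29): 1 - exp(-exp(-z))."""
    return -math.expm1(-math.exp(-z))

def large_deviation_tail(log_n, z):
    """Right-hand side of Eq. (30) evaluated at xi = 1 + z / ln N."""
    xi = 1.0 + z / log_n
    return math.exp(-log_n * (xi - 1.0))

log_n, z = 50.0, 6.0
print(exact_tail(log_n, z), typical_tail(z), large_deviation_tail(log_n, z))
```

For $z$ moderately large the three numbers nearly coincide, which is the content of (31) and (32); the residual difference between the last two is of relative order $e^{-z}$.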


4. Gaussian pdf 

An even more interesting case is the Gaussian pdf $p(x) = e^{-x^2/2}/\sqrt{2\pi}$. We present in the following 
subsection a thorough derivation of the centering and scaling constants $a_N$ and $b_N$ for this case, 
as there are several subtleties that are worth highlighting. 


4.1. Limiting distribution 
From the general formalism 

$$
Q_N(x) = \left[1 - \int_x^{\infty} dy\, \frac{e^{-y^2/2}}{\sqrt{2\pi}}\right]^{N} = \left[\frac{1}{2}\left(1 + \mathrm{erf}(x/\sqrt{2})\right)\right]^{N} , \tag{33}
$$

where the error function is $\mathrm{erf}(z) = (2/\sqrt{\pi})\int_0^{z} dt\, e^{-t^2}$. 

It is convenient to use the integral form for $Q_N(x)$ to derive the centering and scaling constants 
$a_N$ and $b_N$: 

$$
Q_N(x) = \left[1 - \int_x^{\infty} dy\, \frac{e^{-y^2/2}}{\sqrt{2\pi}}\right]^{N} \sim \exp\left[-N\int_x^{\infty} dy\, \frac{e^{-y^2/2}}{\sqrt{2\pi}}\right] , \tag{34}
$$

as for $x \to \infty$ the integral gives a small contribution, and we can use $(1 - \epsilon)^{N} \sim e^{-N\epsilon}$. 

For $x \to \infty$, the behavior of the integral $I(x) = \int_x^{\infty} dy\, e^{-y^2/2}$ can be estimated as follows. 
Make a change of variables $y = x\tau$, yielding 

$$
I(x) = x\int_1^{\infty} d\tau\, e^{-x^2\tau^2/2} . \tag{35}
$$

The integrand is a fast decreasing function of $\tau$, so for large $x$ the main contribution to the integral 
comes from the vicinity of the point $\tau = 1$. Expanding the function $\tau^2/2$ in the exponent close to 
$\tau = 1$ as $\tau^2/2 = 1/2 + 1\times(\tau - 1) + \ldots$ 

$$
I(x) \sim x\, e^{-x^2/2}\int_1^{\infty} d\tau\, e^{-x^2(\tau - 1)} = \frac{e^{-x^2/2}}{x} \qquad \text{for } x \to +\infty . \tag{36}
$$
Inserting it in (34) 

$$
Q_N(x) \sim \exp\left[-\frac{N}{\sqrt{2\pi}}\,\frac{e^{-x^2/2}}{x}\right] = e^{-e^{-\varphi_N(x)}} , \tag{37}
$$

where 

$$
\varphi_N(x) = -\ln N + \frac{x^2}{2} + \ln(x) + (1/2)\ln(2\pi) . \tag{38}
$$


This looks quite promising in terms of convergence to the expected Gumbel form. Note that it is 
not legitimate to drop the term $1/x$ with respect to $e^{-x^2/2}$ in (36) (or, equivalently, to drop the 
$\ln(x)$ and the constant in (38)) as one would be naively tempted to do. 
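
The quality of the estimate (36), and the role of the $1/x$ prefactor, can be checked against the exact tail integral, which is expressible through the complementary error function as $I(x) = \sqrt{\pi/2}\,\mathrm{erfc}(x/\sqrt{2})$. The Python sketch below is illustrative; the function names are my choices.

```python
import math

def tail_integral_exact(x):
    """I(x) = int_x^inf exp(-y^2/2) dy = sqrt(pi/2) * erfc(x / sqrt(2))."""
    return math.sqrt(math.pi / 2.0) * math.erfc(x / math.sqrt(2.0))

def tail_integral_asymptotic(x):
    """Leading large-x estimate of Eq. (36): exp(-x^2/2) / x."""
    return math.exp(-0.5 * x * x) / x

def tail_integral_without_prefactor(x):
    """The same estimate with the 1/x factor (incorrectly) dropped."""
    return math.exp(-0.5 * x * x)

for x in (3.0, 10.0, 30.0):
    exact = tail_integral_exact(x)
    print(x, exact / tail_integral_asymptotic(x), exact / tail_integral_without_prefactor(x))
```

The first ratio tends to 1 as $x$ grows (from below, with a relative correction of order $1/x^2$), while the second keeps decreasing like $1/x$: the prefactor is not a negligible detail.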

Setting now $x = a_N + b_N z$ in (38) and expanding we obtain 

$$
\varphi_N(a_N + b_N z) = -\ln N + \frac{1}{2}a_N^2 + \frac{1}{2}b_N^2 z^2 + a_N b_N z + \frac{1}{2}\ln(2\pi) + \ln(a_N + b_N z) . \tag{39}
$$

Imposing that (39) should go as $\sim z$ for $N \to \infty$ (as dictated by the Gumbel form $e^{-e^{-z}}$) gives 
the constraint 

$$
b_N = \frac{1}{a_N} . \tag{40}
$$

On 

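Next, in order to neutralize the term $-\ln N$, I put forward the ansatz $a_N = \sqrt{2\ln N + c_N}$, 
obtaining 

$$
\varphi_N\left(\sqrt{2\ln N + c_N} + \frac{z}{\sqrt{2\ln N + c_N}}\right) \sim z + \frac{c_N}{2} + \frac{1}{2}\ln(2\pi) + \ln\left(\sqrt{2\ln N + c_N} + \frac{z}{\sqrt{2\ln N + c_N}}\right) , \tag{41}
$$

where I neglected the term $z^2/(2a_N^2)$ which vanishes for $z \sim O(1)$ and $a_N$ going to infinity when 
$N \to \infty$. 

Expanding the last logarithm, and neglecting the term $z/(2\ln N + c_N)$ I obtain 

$$
\begin{aligned}
\varphi_N\left(\sqrt{2\ln N + c_N} + \frac{z}{\sqrt{2\ln N + c_N}}\right)
&\sim z + \frac{c_N}{2} + \frac{1}{2}\ln(2\pi) + \frac{1}{2}\ln(2\ln N + c_N) \\
&\sim z + \frac{c_N}{2} + \frac{1}{2}\ln(2\pi) + \frac{1}{2}\ln(2\ln N) + \frac{1}{2}\frac{c_N}{2\ln N} .
\end{aligned}
\tag{42}
$$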


The constant $c_N$ can now be determined by the condition 

$$
\frac{c_N}{2} + \frac{1}{2}\ln(2\pi) + \frac{1}{2}\ln(2\ln N) + \frac{1}{2}\frac{c_N}{2\ln N} = 0
\;\Rightarrow\;
c_N = -\frac{2\ln N\,\ln(4\pi\ln N)}{1 + 2\ln N} \sim -\ln(4\pi\ln N) . \tag{43}
$$

In summary, the two centering and scaling constants for the Gaussian pdf are 

$$
a_N = \sqrt{2\ln N - \ln(4\pi\ln N)} , \qquad
b_N = \frac{1}{\sqrt{2\ln N - \ln(4\pi\ln N)}} . \tag{44}
$$
One can use the conditions (9) and (10) to simplify these expressions. The claim is that one can 
equivalently use§ 

$$
a'_N = \sqrt{2\ln N} - \frac{\ln(4\pi\ln N)}{2\sqrt{2\ln N}} , \qquad
b'_N = \frac{1}{\sqrt{2\ln N}} . \tag{45}
$$

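Indeed, the limits 

$$
\lim_{N\to\infty} \frac{b'_N}{b_N} = \lim_{N\to\infty} \frac{\sqrt{2\ln N - \ln(4\pi\ln N)}}{\sqrt{2\ln N}} = 1 \tag{46}
$$

$$
\lim_{N\to\infty} \frac{a_N - a'_N}{b_N} = \lim_{N\to\infty} \frac{\sqrt{2\ln N - \ln(4\pi\ln N)} - \sqrt{2\ln N} + \frac{\ln(4\pi\ln N)}{2\sqrt{2\ln N}}}{1/\sqrt{2\ln N - \ln(4\pi\ln N)}} = 0 , \tag{47}
$$

as dictated by (9) and (10). 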

I offer some remarks here. 


(i) It is not legitimate to drop the second term in (45) with the argument that it vanishes 
while the first diverges as $N \to \infty$. This would be tantamount to claiming that $a''_N = \sqrt{2\ln N}$ 
is equally fit to stand as centering constant. But the limit 

$$
\lim_{N\to\infty} \frac{a_N - a''_N}{b_N} = \lim_{N\to\infty} \frac{\sqrt{2\ln N - \ln(4\pi\ln N)} - \sqrt{2\ln N}}{1/\sqrt{2\ln N - \ln(4\pi\ln N)}} = -\infty , \tag{48}
$$

in violation of the requirement (10). 

(ii) The correctness of the constants (45) can also be ascertained numerically. Mathematica is 
able to compute the limit $\lim_{N\to\infty} Q_N(a'_N + b'_N z)$ for a specific value of $z$. Starting from (33), 
I write the following two lines of code 

Q[NN_, z_] := ((1/2) (1 + Erf[(Sqrt[2 Log[NN]] 
   - Log[4 Pi Log[NN]]/(2 Sqrt[2 Log[NN]]) + z/Sqrt[2 Log[NN]])/Sqrt[2]]))^NN; 
Limit[Q[NN, 0.001], NN -> Infinity] 

» 0.368247 
Exp[-Exp[-0.001]] 

» 0.368247 

Similarly, one can further disprove the naive use of $a''_N = \sqrt{2\ln N}$ as a centering constant 
with the code 

Qfalse[NN_, z_] := ((1/2) (1 + Erf[(Sqrt[2 Log[NN]] 
   + z/Sqrt[2 Log[NN]])/Sqrt[2]]))^NN; 
Limit[Qfalse[NN, 0.001], NN -> Infinity] 

» 1. 
Exp[-Exp[-0.001]] 

» 0.368247 

§ One often finds misprints in the (few) published resources where such constants are spelt out somewhat explicitly. 
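
For readers without access to Mathematica, a finite-$N$ version of the same check can be sketched in Python. This is an illustration and not a substitute for the symbolic limit: it evaluates (33) at a large but finite $N = e^{600}$ (an arbitrary choice, close to the largest value for which `math.erfc` does not underflow), and convergence in the Gaussian case is only logarithmic in $N$, so agreement to a few parts in a thousand is all one should expect. The function names are my own.

```python
import math

def q_gaussian(x, log_n):
    """Q_N(x) from Eq. (33) with N = exp(log_n), evaluated via log1p for accuracy."""
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))  # Prob[X > x] for a single variable
    return math.exp(math.exp(log_n) * math.log1p(-tail))

def a_prime(log_n):
    """Centering constant a'_N of Eq. (45)."""
    root = math.sqrt(2.0 * log_n)
    return root - math.log(4.0 * math.pi * log_n) / (2.0 * root)

def b_prime(log_n):
    """Scaling constant b'_N of Eq. (45)."""
    return 1.0 / math.sqrt(2.0 * log_n)

def a_naive(log_n):
    """Naive centering constant sqrt(2 ln N), which violates condition (10)."""
    return math.sqrt(2.0 * log_n)

log_n, z = 600.0, 0.001
gumbel = math.exp(-math.exp(-z))
q_good = q_gaussian(a_prime(log_n) + b_prime(log_n) * z, log_n)
q_bad = q_gaussian(a_naive(log_n) + b_prime(log_n) * z, log_n)
print(gumbel, q_good, q_bad)
```

With the constants (45) the finite-$N$ value sits close to the Gumbel prediction, whereas with the naive centering it is already close to 1.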
As for the exponential pdf, I now wish to address the probability of anomalously large fluctuations 
of the maximum to the right of the expected value (the centering constant $a_N$ or $a'_N$). More 
precisely, I wish to compute 

$$
\mathrm{Prob}[X_{\max} > a'_N\xi] = \,? \tag{49}
$$

and how this probability decays for large $N$. 

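The calculation can be performed easily 

$$
\mathrm{Prob}[X_{\max} > a'_N\xi] = 1 - Q_N(a'_N\xi) = 1 - \left[1 - \int_{a'_N\xi}^{\infty} \frac{e^{-y^2/2}}{\sqrt{2\pi}}\, dy\right]^{N} \sim N\int_{a'_N\xi}^{\infty} \frac{e^{-y^2/2}}{\sqrt{2\pi}}\, dy . \tag{50}
$$

The integral can be estimated in full analogy with $I(x)$ in (36), yielding 

$$
\mathrm{Prob}[X_{\max} > a'_N\xi] \sim \frac{N}{\sqrt{2\pi}}\,\frac{e^{-(a'_N\xi)^2/2}}{a'_N\xi} . \tag{51}
$$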


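Taking the logarithm on both sides, dividing by $\ln N$ (the speed) and replacing the definition of 
$a'_N$ from (45), one obtains the formidable limit 

$$
\lim_{N\to\infty} \frac{-\ln\mathrm{Prob}[X_{\max} > a'_N\xi]}{\ln N}
= \lim_{N\to\infty} -\frac{1}{\ln N}\ln\left[\frac{N\exp\left(-\frac{\xi^2}{2}\left(\sqrt{2\ln N} - \frac{\ln(4\pi\ln N)}{2\sqrt{2\ln N}}\right)^{2}\right)}{\sqrt{2\pi}\,\xi\left(\sqrt{2\ln N} - \frac{\ln(4\pi\ln N)}{2\sqrt{2\ln N}}\right)}\right]
= \xi^2 - 1 = \psi_r(\xi) , \qquad \xi > 1 . \tag{52}
$$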
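
The limit (52) looks formidable but is easy to probe numerically. The Python sketch below is illustrative (function names and parameter values are my choices): it takes the asymptotic estimate (51) as its starting point, works with $\ln N$ as the variable so that astronomically large $N$ poses no overflow problem, and prints $-\ln\mathrm{Prob}/\ln N$ for $\xi = 1.5$.

```python
import math

def a_prime(log_n):
    """Centering constant a'_N of Eq. (45), as a function of ln N."""
    root = math.sqrt(2.0 * log_n)
    return root - math.log(4.0 * math.pi * log_n) / (2.0 * root)

def log_tail_estimate(log_n, xi):
    """Logarithm of the right-hand side of Eq. (51), with N = exp(log_n)."""
    x = a_prime(log_n) * xi
    return log_n - 0.5 * x * x - math.log(x) - 0.5 * math.log(2.0 * math.pi)

def empirical_rate(log_n, xi):
    """Finite-N estimate of psi_r(xi): -ln Prob / ln N."""
    return -log_tail_estimate(log_n, xi) / log_n

for log_n in (1e2, 1e4, 1e6, 1e9):
    print(log_n, empirical_rate(log_n, 1.5))   # slowly approaches 1.5**2 - 1 = 1.25
```

The approach to $\xi^2 - 1$ is slow, with corrections of relative order $\ln(\ln N)/\ln N$ coming from the second term of $a'_N$.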




Again, the two (small and large) deviation results can be summarized as 

$$
\mathrm{Prob}[X_{\max} > a'_N + b'_N z] \sim 1 - e^{-e^{-z}} , \tag{53}
$$

$$
\mathrm{Prob}[X_{\max} > a'_N\xi] \approx e^{-(\ln N)(\xi^2 - 1)} , \tag{54}
$$

with $z$ and $\xi$ of $O(1)$ for large $N$. The constants $a'_N$ and $b'_N$ are given in (45). 
Expanding the rate function $\psi_r(\xi) = \xi^2 - 1$ around $\xi = 1$, one obtains 

$$
\psi_r(\xi) \sim 2(\xi - 1) . \tag{55}
$$

For $z \gg 1$ one has 

$$
\mathrm{Prob}[X_{\max} > a'_N + b'_N z] \approx e^{-z} . \tag{56}
$$

Setting now $a'_N + b'_N z \sim a'_N\xi$, one obtains that in the matching regime $\xi \sim 1 + b'_N z/a'_N \sim 
1 + z/(2\ln N)$, and substituting in (54) with the expanded rate function (55) 

$$
e^{-2(\ln N)(\xi - 1)}\Big|_{\xi \sim 1 + z/(2\ln N)} \sim e^{-z} \tag{57}
$$
(57) 
as in (56). Hence, once again the large deviation (54) when approaching $a'_N$ ($\xi \to 1$) from the right 
on a scale of $O(1/\ln N)$ smoothly matches the far-right tail of the typical (limiting) distribution. 

This example further confirms that in any case the full rate function $\psi_r(\xi) = \xi^2 - 1$ could 
not have been predicted appealing to the matching property alone, which only requires that the 
expansion around $\xi = 1$ is $\sim 2(\xi - 1)$. Therefore the large deviations results (30) and (54) genuinely 
provide extra information, which is not carried by the limiting distribution (Gumbel) alone. 

It is also easy to deduce along the same lines that the rate function $\psi_r(\xi)$ for the maximum 
of i.i.d. variables whose common pdf decays at infinity as $p(x) \sim e^{-x^{\delta}}$ is given by $\psi_r(\xi) = \xi^{\delta} - 1$, 
in agreement with [10]. I am not aware of a similar LDT treatment for EVS of densities in the 
Fréchet basin of attraction, while for the Weibull class this is also possible (with speed $N$), but 
somewhat much less interesting [10]. 

5. Conclusions 

In summary, having in mind an audience of theoretical and statistical physicists, I have presented 
some aspects of Extreme Value Statistics (restricted to the Gumbel basin of attraction) from 
the somewhat non-standard viewpoint of Large Deviation Theory. First, an introduction to the 
universality classes for the EVS statistics was given, and then the exponential and Gaussian parent 
pdf were worked out in detail. I pointed out some subtleties connected with the centering and 
scaling constants $a_N$ and $b_N$ for the Gaussian case, which are difficult to find discussed in the 
literature, and eventually the right rate function $\psi_r(\xi)$ was derived for large deviations of the 
maximum to the right of its expected value in both cases. Demonstrating a smooth matching 
between the far tail of the limiting distribution (small deviation) and the large deviation result, I 
stressed that the rate function cannot be deduced from the knowledge of the limiting distribution 
(Gumbel) alone, thus implying that it carries additional information. This will not come as a 
surprise for the reader familiar with the LDT for the maximum eigenvalue of $N \times N$ Gaussian 
and Wishart random matrices [9]: the role of Gumbel is taken by the Tracy-Widom distribution 
there [19,20], while right and left rate functions (corresponding to speeds $N$ and $N^2$ respectively) 
were independently derived using different strategies [21-23]. The corresponding (much simpler) 
result for i.i.d. variables, which is at the same time cute and instructive, seems to deserve a better 
fate than the oblivion it has fallen into. 